{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Sxt-9qpNgPxo"
      },
      "source": [
        "##### Copyright 2020 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Phnw6c3-gQ1f"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aalPefrUUplk"
      },
      "source": [
        "# FaceSSD Fairness Indicators Example Colab"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KFRBcGOYgEAI"
      },
      "source": [
        "\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/responsible_ai/fairness_indicators/tutorials/Facessd_Fairness_Indicators_Example_Colab\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/fairness-indicators/blob/master/g3doc/tutorials/Facessd_Fairness_Indicators_Example_Colab.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/fairness-indicators/tree/master/g3doc/tutorials/Facessd_Fairness_Indicators_Example_Colab.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView on GitHub\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/fairness-indicators/g3doc/tutorials/Facessd_Fairness_Indicators_Example_Colab.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "\u003c/table\u003e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UZ48WFLwbCL6"
      },
      "source": [
        "## Overview\n",
        "\n",
        "In this activity, you'll use [Fairness Indicators](https://www.tensorflow.org/tfx/guide/fairness_indicators) to explore the [FaceSSD predictions on the Labeled Faces in the Wild dataset](https://modelcards.withgoogle.com/face-detection). Fairness Indicators is a suite of tools built on top of [TensorFlow Model Analysis](https://www.tensorflow.org/tfx/model_analysis/get_started) that enables regular evaluation of fairness metrics in product pipelines.\n",
        "\n",
        "## About the Dataset\n",
        "\n",
        "In this exercise, you'll work with the FaceSSD prediction dataset, which contains approximately 200k image predictions and ground truths generated by the FaceSSD API.\n",
        "\n",
        "## About the Tools\n",
        "\n",
        "[TensorFlow Model Analysis](https://www.tensorflow.org/tfx/model_analysis/get_started) is a library for evaluating both TensorFlow and non-TensorFlow machine learning models. It lets you evaluate models on large amounts of data in a distributed manner, compute in-graph and other metrics over different slices of data, and visualize the results in notebooks.\n",
        "\n",
        "[TensorFlow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started) is one tool you can use to analyze your data. You can use it to find potential problems in your data, such as missing values and data imbalances, that can lead to fairness disparities.\n",
        "\n",
        "With [Fairness Indicators](https://www.tensorflow.org/tfx/guide/fairness_indicators), you can:\n",
        "\n",
        "* Evaluate model performance, sliced across defined groups of users\n",
        "* Feel confident about results with confidence intervals and evaluations at multiple thresholds"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "u33JXdluZ2lG"
      },
      "source": [
        "# Importing\n",
        "\n",
        "Run the following code to install the `fairness-indicators` package, which contains the tools used in this exercise. Colab may ask you to restart the runtime, but doing so is not necessary."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "EoRNffG599XP"
      },
      "outputs": [],
      "source": [
        "!pip install -q -U pip==20.2\n",
        "!pip install fairness-indicators"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "B8dlyTyiTe-9"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import tempfile\n",
        "import apache_beam as beam\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "from datetime import datetime\n",
        "\n",
        "import tensorflow_hub as hub\n",
        "import tensorflow as tf\n",
        "import tensorflow_model_analysis as tfma\n",
        "import tensorflow_data_validation as tfdv\n",
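        "# Imported for its side effect: it makes tfma.post_export_metrics.fairness_indicators\n",
        "# (used in the metrics callbacks later in this notebook) available.\n",
        "from tensorflow_model_analysis.addons.fairness.post_export_metrics import fairness_indicators\n",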
        "from tensorflow_model_analysis.addons.fairness.view import widget_view\n",
        "from tensorflow_model_analysis.model_agnostic_eval import model_agnostic_predict as agnostic_predict\n",
        "from tensorflow_model_analysis.model_agnostic_eval import model_agnostic_evaluate_graph\n",
        "from tensorflow_model_analysis.model_agnostic_eval import model_agnostic_extractor\n",
        "\n",
        "from witwidget.notebook.visualization import WitConfigBuilder\n",
        "from witwidget.notebook.visualization import WitWidget"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TsplOJGqWCf5"
      },
      "source": [
        "# Download and Understand the Data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vFOQ4AaIcAn2"
      },
      "source": [
        "[Labeled Faces in the Wild](http://vis-www.cs.umass.edu/lfw/) is a public benchmark dataset for face verification, also known as pair matching. LFW contains more than 13,000 images of faces collected from the web.\n",
        "\n",
        "We ran FaceSSD on this dataset to predict whether a face is present in a given image. In this Colab, we slice the data by gender to check whether model performance differs significantly between gender groups.\n",
        "\n",
        "If there is more than one face in an image, gender is labeled as \"MISSING\".\n",
        "\n",
        "We've hosted the dataset on Google Cloud Platform for convenience. Run the following code to download the data from GCP. Downloading and analyzing the data takes about a minute."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "NdLBi6tN5i7I"
      },
      "outputs": [],
      "source": [
        "data_location = tf.keras.utils.get_file('lfw_dataset.tf', 'https://storage.googleapis.com/facessd_dataset/lfw_dataset.tfrecord')\n",
        "\n",
        "stats = tfdv.generate_statistics_from_tfrecord(data_location=data_location)\n",
        "tfdv.visualize_statistics(stats)"
      ]
    },
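    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To see which features each record carries, you can decode a single record by hand. The following is a minimal sketch, assuming TensorFlow 2.x eager execution (the Colab default) and the `data_location` path from the previous cell. It only prints the feature keys of the first record and does not modify the data."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Sketch: peek at the first serialized record in the downloaded TFRecord file.\n",
        "# Assumes `tf` and `data_location` are defined by the cells above.\n",
        "for raw_record in tf.data.TFRecordDataset(data_location).take(1):\n",
        "  # Each record is a serialized tf.train.Example proto.\n",
        "  example = tf.train.Example.FromString(raw_record.numpy())\n",
        "  # Print the feature names stored in this example, in sorted order.\n",
        "  for key in sorted(example.features.feature):\n",
        "    print(key)"
      ]
    },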
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cNODEwE5x7Uo"
      },
      "source": [
        "# Defining Constants"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ZF4NO87uFxdQ"
      },
      "outputs": [],
      "source": [
        "BASE_DIR = tempfile.gettempdir()\n",
        "\n",
        "tfma_eval_result_path = os.path.join(BASE_DIR, 'tfma_eval_result')\n",
        "\n",
        "compute_confidence_intervals = True\n",
        "\n",
        "slice_key = 'object/groundtruth/Gender'\n",
        "label_key = 'object/groundtruth/face'\n",
        "prediction_key = 'object/prediction/face'\n",
        "\n",
        "feature_map = {\n",
        "    slice_key:\n",
        "        tf.io.FixedLenFeature([], tf.string, default_value=['none']),\n",
        "    label_key:\n",
        "        tf.io.FixedLenFeature([], tf.float32, default_value=[0.0]),\n",
        "    prediction_key:\n",
        "        tf.io.FixedLenFeature([], tf.float32, default_value=[0.0]),\n",
        "}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gVLHwuhEyI8R"
      },
      "source": [
        "# Model Agnostic Config for TFMA"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ej1nGCZSyJIK"
      },
      "outputs": [],
      "source": [
        "model_agnostic_config = agnostic_predict.ModelAgnosticConfig(\n",
        "    label_keys=[label_key],\n",
        "    prediction_keys=[prediction_key],\n",
        "    feature_spec=feature_map)\n",
        "\n",
        "model_agnostic_extractors = [\n",
        "    model_agnostic_extractor.ModelAgnosticExtractor(\n",
        "        model_agnostic_config=model_agnostic_config, desired_batch_size=3),\n",
        "    tfma.extractors.slice_key_extractor.SliceKeyExtractor(\n",
        "          [tfma.slicer.SingleSliceSpec(),\n",
        "           tfma.slicer.SingleSliceSpec(columns=[slice_key])])\n",
        "]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wqkk9SkvyVkR"
      },
      "source": [
        "# Fairness Callbacks and Computing Fairness Metrics"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "A0icrlliBCOb"
      },
      "outputs": [],
      "source": [
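        "# Helper CombineFn that counts the elements of a Beam PCollection and prints\n",
        "# the total, prefixed with `message`.\n",
        "class CountExamples(beam.CombineFn):\n",
        "  def __init__(self, message):\n",
        "    self.message = message\n",
        "\n",
        "  def create_accumulator(self):\n",
        "    # Start each partial count at zero.\n",
        "    return 0\n",
        "\n",
        "  def add_input(self, current_sum, element):\n",
        "    # Every element adds one to the count, regardless of its content.\n",
        "    return current_sum + 1\n",
        "\n",
        "  def merge_accumulators(self, accumulators):\n",
        "    # Combine partial counts computed on different bundles or workers.\n",
        "    return sum(accumulators)\n",
        "\n",
        "  def extract_output(self, final_sum):\n",
        "    if final_sum:\n",
        "      print(\"%s: %d\" % (self.message, final_sum))\n",
        "    # Also return the count so the combined PCollection holds it.\n",
        "    return final_sum"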
      ]
    },
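    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a quick sanity check, `CountExamples` can be tried on a small in-memory PCollection before it is used on the real data. This is only a sketch: the three input strings, the message, and the step labels are made up for illustration and are not part of the LFW data."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Sketch: count a tiny, hypothetical PCollection with the helper above.\n",
        "# Assumes `beam` and `CountExamples` are defined by the cells above.\n",
        "with beam.Pipeline() as toy_pipeline:\n",
        "  _ = (\n",
        "      toy_pipeline\n",
        "      | 'CreateToyInput' \u003e\u003e beam.Create(['a', 'b', 'c'])\n",
        "      | 'CountToyInput' \u003e\u003e beam.CombineGlobally(\n",
        "          CountExamples('Toy examples')))"
      ]
    },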
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mRQjdjp9yVv2"
      },
      "outputs": [],
      "source": [
        "metrics_callbacks = [\n",
        "  tfma.post_export_metrics.fairness_indicators(\n",
        "      thresholds=[0.1, 0.3, 0.5, 0.7, 0.9],\n",
        "      labels_key=label_key,\n",
        "      target_prediction_keys=[prediction_key]),\n",
        "  tfma.post_export_metrics.auc(\n",
        "      curve='PR',\n",
        "      labels_key=label_key,\n",
        "      target_prediction_keys=[prediction_key]),\n",
        "]\n",
        "\n",
        "eval_shared_model = tfma.types.EvalSharedModel(\n",
        "    add_metrics_callbacks=metrics_callbacks,\n",
        "    construct_fn=model_agnostic_evaluate_graph.make_construct_fn(\n",
        "        add_metrics_callbacks=metrics_callbacks,\n",
        "        config=model_agnostic_config))\n",
        "\n",
        "with beam.Pipeline() as pipeline:\n",
        "  # Read data.\n",
        "  data = (\n",
        "      pipeline\n",
        "      | 'ReadData' \u003e\u003e beam.io.ReadFromTFRecord(data_location))\n",
        "\n",
        "  # Count all examples.\n",
        "  data_count = (\n",
        "      data | 'Count number of examples' \u003e\u003e beam.CombineGlobally(\n",
        "          CountExamples('Before filtering \"Gender:MISSING\"')))\n",
        "\n",
        "  # If there is more than one face in an image, the gender feature is\n",
        "  # 'MISSING' and we filter that image out.\n",
        "  def filter_missing_gender(element):\n",
        "    example = tf.train.Example.FromString(element)\n",
        "    if example.features.feature[slice_key].bytes_list.value[0] != b'MISSING':\n",
        "      yield element\n",
        "\n",
        "  filtered_data = (\n",
        "      data\n",
        "      | 'Filter Missing Gender' \u003e\u003e beam.ParDo(filter_missing_gender))\n",
        "  \n",
        "  # Count after filtering \"Gender:MISSING\".\n",
        "  filtered_data_count = (\n",
        "      filtered_data | 'Count number of examples after filtering'\n",
        "      \u003e\u003e beam.CombineGlobally(\n",
        "          CountExamples('After filtering \"Gender:MISSING\"')))\n",
        "\n",
        "  # Every image in the LFW dataset contains a face, so we set the\n",
        "  # ground-truth label to 1.0 for all images.\n",
        "  def add_face_groundtruth(element):\n",
        "    example = tf.train.Example.FromString(element)\n",
        "    example.features.feature[label_key].float_list.value[:] = [1.0]\n",
        "    yield example.SerializeToString()\n",
        "\n",
        "  final_data = (\n",
        "      filtered_data\n",
        "      | 'Add Face Groundtruth' \u003e\u003e beam.ParDo(add_face_groundtruth))\n",
        "\n",
        "  # Run TFMA.\n",
        "  _ = (\n",
        "      final_data\n",
        "      | 'ExtractEvaluateAndWriteResults' \u003e\u003e\n",
        "       tfma.ExtractEvaluateAndWriteResults(\n",
        "                 eval_shared_model=eval_shared_model,\n",
        "                 compute_confidence_intervals=compute_confidence_intervals,\n",
        "                 output_path=tfma_eval_result_path,\n",
        "                 extractors=model_agnostic_extractors))\n",
        "\n",
        "eval_result = tfma.load_eval_result(output_path=tfma_eval_result_path)"
      ]
    },
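    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Because every example was given a ground-truth label of 1.0 above, the threshold-based metrics effectively describe how often the face score clears each threshold, that is, the true positive rate. The sketch below illustrates that calculation with NumPy on made-up scores for two hypothetical slices. It treats a score strictly above the threshold as a detection, which is an assumption made here for illustration and not a statement about how TFMA computes the metric internally."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "\n",
        "# Hypothetical prediction scores for two made-up slices; not taken from LFW.\n",
        "toy_scores = {\n",
        "    'slice_a': np.array([0.95, 0.80, 0.40, 0.20]),\n",
        "    'slice_b': np.array([0.99, 0.85, 0.75, 0.60]),\n",
        "}\n",
        "toy_thresholds = [0.1, 0.3, 0.5, 0.7, 0.9]\n",
        "\n",
        "def toy_true_positive_rate(scores, threshold):\n",
        "  # All examples are assumed to be positives (label 1.0), so the true\n",
        "  # positive rate is the fraction of scores above the threshold.\n",
        "  return float(np.mean(scores \u003e threshold))\n",
        "\n",
        "# Print the per-slice true positive rate at each threshold.\n",
        "for slice_name, scores in toy_scores.items():\n",
        "  for threshold in toy_thresholds:\n",
        "    tpr = toy_true_positive_rate(scores, threshold)\n",
        "    print('%s threshold=%.1f TPR=%.2f' % (slice_name, threshold, tpr))"
      ]
    },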
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ktlASJQIzE3l"
      },
      "source": [
        "# Render Fairness Indicators\n",
        "\n",
        "Render the Fairness Indicators widget with the exported evaluation results.\n",
        "\n",
        "Below you will see bar charts displaying the performance of each slice of the data on the selected metrics. You can adjust the baseline comparison slice as well as the displayed threshold(s) using the drop-down menus at the top of the visualization.\n",
        "\n",
        "A relevant metric for this use case is the true positive rate, also known as recall. Use the selector on the left-hand side to choose the graph for true_positive_rate. These metric values match the values displayed on the [model card](https://modelcards.withgoogle.com/face-detection).\n",
        "\n",
        "For some photos, gender is labeled as young instead of male or female because the person in the photo is too young to be accurately annotated."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "JNaNhTCTAMHm"
      },
      "outputs": [],
      "source": [
        "widget_view.render_fairness_indicator(eval_result=eval_result,\n",
        "                                      slicing_column=slice_key)"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "collapsed_sections": [
        "Sxt-9qpNgPxo"
      ],
      "name": "Facessd Fairness Indicators Example Colab.ipynb",
      "provenance": [],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
